-
Notifications
You must be signed in to change notification settings - Fork 764
use django-post-office
with a django_ses
backend for emails
#6533
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use django-post-office
with a django_ses
backend for emails
#6533
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Configuration looks good. Don't we need though to use the send
method from post office? What are the next steps?
No, because the
I added a list of next steps in the introduction. |
4ca8027
to
548f16b
Compare
548f16b
to
d1904cc
Compare
@@ -1,3 +1,3 @@ | |||
#!/bin/bash | |||
|
|||
exec newrelic-admin run-program celery -A kitsune worker -Q celery,priority --max-tasks-per-child=${CELERY_WORKER_MAX_TASKS_PER_CHILD:-25} | |||
exec newrelic-admin run-program celery -A kitsune worker -Q "${1:-celery}" --max-tasks-per-child=${CELERY_WORKER_MAX_TASKS_PER_CHILD:-25} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nicely done
mozilla/sumo#1993
mozilla/sumo#2215
This PR should solve our email issues.
Important Notes
post_office
email backend simply queues emails in the DB. You have to tell it what "real" email backend to use for actually sending the emails. In this case, we're using thedjango_ses.SESBackend
. Whenpost_office
creates an instance of the "real" backend in order to actually send the emails, it always creates it withfail_silently=False
, because it captures any exception when trying to send an email, marks that email as "failed", and schedules it for a retry. So, it really doesn't matter what value forfail_silently
you pass to thepost_office.EmailBackend
, it's always ignored by that backend.post_office
queues a Celery task -- because we're setting itsCELERY_ENABLED
setting toTrue
. This means that its built-inpost_office.tasks.send_queued_mail
Celery task is kicked-off each time either thepost_office.EmailBackend.send_messages
method or thepost_office.mail.send
function is called. It's important to keep in mind that thepost_office.tasks.send_queued_mail
Celery task is not given a specific list of queued emails to send. Instead, it runs a loop of querying the database for a batch of queued emails, sending that batch using the real backend (in this casedjango_ses.SESBackend
), updating the status of the emails in the DB, creating DB logs, and so on until its DB query for the next batch of emails returns nothing. The important point I'm trying to make here is that each time we callsend_messages
in our code, we're kicking off a Celery task that will work on all queued emails, not just the ones in thesend_messages
call. That's why in this PR I've modified our calls tosend_messages
to always pass in the entire set of messages we want to send, so no longer callingsend_messages
for each individual message.post_office.tasks.send_queued_mail
task assumes that all of the Celery workers that could handle that task share the same file system. If not, its file-based locking system fails to limit the execution ofpost_office.mail.send_queued
to a single run at a time, which in turn means that it's highly likely that duplicate emails would be sent. In our case, each of our Celery workers is its own pod, so only the 4 worker processes within that pod share the same file system, but not all of our workers. The root cause of this issue is that thesend_queued
function does not work on a specific set of emails, but instead, as mentioned earlier, its work consists of querying the DB for a batch of emails, sending them, and then updating their status, and that work is not done atomically. So if more than onesend_queued
function is running at the same time (highly likely given our current state), it's likely that they grab the same batch of emails and send them. The solution I've taken is to create a separate Celery queue for emails -- the newemail
queue -- as well as another Celery K8sdeployment
with a single replica (pod) that pulls only from theemail
queue (see https://github.com/mozilla-it/webservices-infra/pull/4127). Also, the other Celery workers only pull tasks from the defaultcelery
queue, not from theemail
queue, so effectively, only a single pod is ever pulling tasks from theemail
queue.post_office
to log (in the database) both successful and failed emails (which is the default actually). The log entries are available via the admin interface.post_office
provides theOVERRIDE_RECIPIENTS
setting, which specifies one or more email addresses that you'd like to always use instead of an email's actual recipient. This PR goes a step further -- mimicking what thedjango-email-bandit
package does -- by indicating the original recipient within the email's text and HTML alternatives. This is really important, because it allows us to perform QA on stage using the real email system without sending unwanted emails. On stage, all emails would go to the email(s) defined in theOVERRIDE_RECIPIENTS
setting. This also means that we no longer need thedjango-email-bandit
package.Next Steps
sumo-ses-v2-send-and-getaccount
for SES v2 access. This has already been done.sumo-ses-v2-send-and-getaccount
IAM policy to theses-smtp-user.prod
user.ses-smtp-user.prod
user.sumo-ses-v2-send-and-getaccount
IAM policy to theses-smtp-user.dev
andses-smtp-user.stage
users.ses-smtp-user.dev
andses-smtp-user.stage
users.dev
,stage
, andprod
(and don't forget the Europeanprod
secrets) in GCP.POST_OFFICE_OVERRIDE_RECIPIENTS
secret to thedev
andstage
secrets in GCP. It should be set to the same value as is currently defined for theBANDIT_EMAIL
setting.POST_OFFICE_OVERRIDE_RECIPIENTS
recipients -- we can use stage to QA the real email system.Post-Release Steps
webservices-infra
that removes theEMAIL_LOGGING_REAL_BACKEND
setting fromvalues-dev.yaml
,values-stage.yaml
,values-stage-europe-west1.yaml
,values-prod.yaml
, andvalues-prod-europe-west1.yaml
.django-email-bandit
package.BANDIT_EMAIL
setting within thedev
andstage
secrets.